ABSTRACT

Accumulating evidence indicates that MERS-CoV originated from bat
coronaviruses (BatCoVs). Previously, we demonstrated that both MERS-CoV
and BatCoV HKU4 use CD26 as a receptor, but how the BatCoVs evolved to
bind CD26 is an intriguing question. Here, we solved the crystal
structure of the S1 subunit C-terminal domain of HKU5 (HKU5-CTD),
another BatCoV that is phylogenetically related to MERS-CoV but cannot
bind to CD26. We observed that the core subdomain is conserved among
betacoronaviruses (betaCoVs) and that the external subdomain has a
topology similar to those of MERS-CoV and HKU4, indicating a common
ancestor of lineage C betaCoVs. However, two deletions in two
respective loops of HKU5-CTD result in conformational variations in the
CD26-binding interface and are responsible for the non-binding of
HKU5-CTD to CD26. Combined with the sequence variation in the HKU5-CTD
receptor-binding interface, we propose that mutations in the BatCoV
HKU5 spike protein should be monitored in case of bat-to-human
interspecies transmission.

Keywords: MERS-CoV; BatCoV HKU5; CTD

1. Introduction 


Coronaviruses (CoVs) are spherical enveloped viruses with single 
positive-strand RNA genomes of ~30 kb in length, which is the largest 
among RNA viruses (Saif, 1993). CoVs are divided into four genera: 
alpha-, beta-, gamma-, and deltaCoVs (de Groot et al., 2013). BetaCoVs 
are further subdivided into four lineages/subgroups: A, B, C, and D 
(Chan et al., 2015). To date, both alpha- and betaCoVs are found to 
infect humans (Lu et al., 2015), causing subclinical or very mild 
symptoms and accounting for 10-15% of common colds (Heikkinen 
and Jarvinen, 2003). In addition, CoVs can also be life-threatening and 
have pandemic potential. The epidemic of severe acute respiratory 
syndrome coronavirus (SARS-CoV), which belongs to lineage B of the 
betaCoVs, originated in southern China in 2002 and spread to 28
countries, infecting over 8000 people and causing almost 800 deaths
(WHO, 2004). The outbreak of MERS-CoV, a member of the lineage C
betaCoVs (Cotten et al., 2013; Zaki et al., 2012), has caused 1832
laboratory-confirmed cases since 2012, including 651 related deaths as
of Nov. 28, 2016 (WHO, 2016). Unlike SARS-CoV infections, which
suddenly ceased after a massive global disease control effort,
especially in China, MERS-CoV infections are still on the rise.
Mounting evidence indicates that CoVs circulating in bats 
(BatCoVs) are the gene sources of alphaCoVs and betaCoVs (W. Li 
et al., 2005; Woo et al., 2012), including SARS-CoV (Ge et al., 2013; 
Lau et al., 2005; W. Li et al., 2005). The data also underscore that bats 
are the likely natural reservoir for MERS-CoV (Annan et al., 2013; 
Ithete et al., 2013; Memish et al., 2013; Wang et al., 2014; Yang et al., 
2014). For instance, viral gene fragments identical or quite similar to 
those of MERS-CoV have been reported in bats (Annan et al., 2013; 
Ithete et al., 2013; Memish et al., 2013). Moreover, parallel studies 
from our group and others show that BatCoV HKU4, grouped in 
lineage C with MERS-CoV, can also use human CD26 (hCD26; the 
receptor of MERS-CoV) for viral entry (Wang et al., 2014; Yang et al., 
2014). In other words, two members of lineage C use the same human 
receptor: one (MERS-CoV) has caused an epidemic in humans, and the 
other (BatCoV HKU4) can utilize the same receptor and therefore has the 
potential to infect humans. This highlights the necessity of surveillance 
for lineage C betaCoVs, including BatCoV HKU5, which was first 
sequenced in 2006 in Japanese pipistrelles (Pipistrellus abramus) 
(Woo et al., 2006) and is circulating in bats (Lau et al., 2013). Whether 
the virus has the potential to bypass the bat-human barrier needs to be 
evaluated. 

CoV infections initiate with the virus binding to the host receptor. 
The envelope-interspersed trimeric spike (S) protein plays a pivotal 
role in this process. S is further divided into two parts: S1, responsible 
for receptor binding, and S2, which initiates fusion (Belouzard et al., 
2012; Dai et al., 2016; Kielian and Rey, 2006). S1 contains two 
relatively independent structures named the N-terminal domain 
(NTD) and C-terminal domain (CTD) based on their position. Most 
betaCoVs use the CTD as the receptor-binding domain (RBD/CTD) 
except mouse hepatitis virus (MHV), which uses the NTD to bind the 
cellular receptor carcinoembryonic-antigen-related cell-adhesion mo- 
lecule 1 (CEACAM1) (Dai et al., 2016; Du et al., 2009; F. Li et al., 2005; 
Lu et al., 2013; Peng et al., 2011). Two of the RBD/CTDs in lineage C 
betaCoVs (MERS-RBD/CTD and HKU4-RBD/CTD) bind to the same 
human receptor, CD26 (hCD26), to initiate infection, and the two 
domains share high sequence identity (55%) in addition to high 
structural similarity, with a root mean square deviation (rmsd) of 
1.114 Å (193 Cα atoms) (Lu et al., 2013; Wang et al., 2014). Despite 
the similar sequence identities between HKU5-CTD and MERS-RBD/CTD (54%) 
or HKU4-RBD/CTD (57%), no detectable binding was found between 
HKU5-CTD and hCD26. The structural basis for this difference remains 
to be elucidated. 
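
The rmsd values quoted here are root mean square deviations over paired Cα atoms. As a minimal sketch of that calculation, assuming the two structures have already been superimposed and their equivalent Cα coordinates are available as matched lists (the function name and the toy coordinates are illustrative and not taken from this study):

```python
import math

def rmsd(coords_a, coords_b):
    """Root mean square deviation between two equally long lists of
    (x, y, z) coordinates, in the units of the input (e.g. angstroms).
    Assumes the structures are already superimposed and atoms are paired."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))

# Toy example: three paired atoms, each displaced by 1.0 along x.
a = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 1.0, 0.0)]
b = [(1.0, 0.0, 0.0), (2.5, 0.0, 0.0), (4.0, 1.0, 0.0)]
print(rmsd(a, b))
```

In practice the superposition step (for example, a least-squares fit) precedes this calculation, and the reported value depends on which atoms are paired.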

In this study, we determined the structure of HKU5-CTD. Similar to 
other solved structures, HKU5-CTD contains two subdomains: the core 
subdomain homologous to other CTDs in betaCoVs and the external 
subdomain, which resembles MERS-RBD/CTD and HKU4-RBD/CTD, 
indicating conservation of the external subdomain in lineage C. However, 
two deletions in HKU5-CTD lead to structural shifts in the 
hCD26-interaction interface and thereby result in its inability to bind 
this receptor. Our results suggest that the characteristic insertions 
between βc4 and βc5 among different lineages of betaCoVs result in 
different receptor engagement, thereby contributing to potential 
interspecies transmission. 


2. Results 
2.1. Overall structure of the HKU5-CTD 


We first characterized the S protein of BatCoV HKU5 through 
bioinformatics analysis. BatCoV HKU5 S is composed of 1352 amino 
acids and exhibits typical features of CoV S proteins, including the 
predicted hydrophobic residue-rich HR1 and HR2 motifs and a 
concentration of hydrophobic amino acids similar to that of the 
SARS-CoV fusion peptide (FP), internal fusion peptide (IFP), and 
pre-transmembrane domain (PTM) (Gao et al., 2013; Mahajan and 
Bhattacharjya, 2015; Xu et al., 2004; Zhu et al., 2004). As in the 
MERS-CoV S protein, a furin-like protease recognition motif is predicted 
at position R745/A746 (S1/S2), which separates the S1 and S2 subunits 
(Millet and Whittaker, 2014). In addition, a second furin cleavage site 
can be found at R884/S885, which resembles S2’ in the MERS-CoV S protein 
(Millet and Whittaker, 2014), indicating that the priming of BatCoV 
HKU5 S in human cells probably occurs in the same way as that of 
MERS-CoV S (Fig. 1A). Because most betaCoVs use their CTD to bind their 
respective receptors, we next focused on the evolutionary relationships 
of the CTDs. Consistent with the phylogenetic relationships, HKU5-CTD, 
HKU4-RBD/CTD, and MERS-RBD/CTD are grouped in one branch 
representing lineage C, while HKU1-CTD and MHV-CTD cluster 
together in lineage A. HKU9-CTD, a member of lineage D, is 
phylogenetically more related to SARS-RBD/CTD, which belongs to 
lineage B (Fig. 1B). 
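
The cleavage sites above were predicted with a dedicated server. Purely as an illustration of what a furin-like motif search looks for, the sketch below scans a sequence for the commonly cited preferred consensus R-X-[K/R]-R, with cleavage after the final arginine. The regular expression, the function name, and the toy sequence are assumptions for illustration; the toy sequence is not BatCoV HKU5 S, and a simple pattern match is not a substitute for a trained predictor.

```python
import re

# Preferred furin-like consensus R-X-[K/R]-R; cleavage occurs after the final R.
# A lookahead is used so that overlapping motifs are all reported.
FURIN_LIKE = re.compile(r"(?=(R.[KR]R))")

def furin_like_sites(seq):
    """Return (motif, p1_position) pairs, where p1_position is the 1-based
    index of the arginine immediately before the putative scissile bond."""
    return [(m.group(1), m.start() + 4) for m in FURIN_LIKE.finditer(seq)]

toy = "MAGTRSVRSLQPRARRAT"  # hypothetical sequence, not BatCoV HKU5 S
print(furin_like_sites(toy))
```

Note that the less stringent minimal motif R-X-X-R would require relaxing the third position of the pattern.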

HKU5-CTD was then purified and crystallized, and the structure 
was determined at a resolution of 2.1 Å, with clear electron 
density tracing from Q389 to Q586. The structure, solved by 
molecular replacement, contains a single molecule in the 
crystallographic asymmetric unit, with an Rwork of 0.2160 and an 
Rfree of 0.2585 (Table 1). Like the other solved CTD structures of 
betaCoVs (Huang et al., 2016; Kirchdoerfer et al., 2016; F. Li et al., 
2005; Lu et al., 2013; Walls et al., 2016; Wang et al., 2013, 2014), 
HKU5-CTD folds into two discrete subdomains, the core and the 
external (Fig. 2). The core subdomain contains a five-stranded 
antiparallel scaffold center (core-center), which is decorated by five 
helices (α or 3₁₀) and two small strands (βp1 and βp2) on the exterior. 
Three disulfide bonds help to stabilize the scaffold, namely C391-C415 
and C445-C583, located in the peripheral region of the core 
subdomain (core-peripheral), and C433-C486 in the core-center, 
linking βc2 and βc4. Notably, two antiparallel β strands, one of which 
is located at the C-terminus and the other of which forms a disulfide 
bond with the N-terminus, help to keep the two termini in proximity. In 
addition, traceable electron density can be observed for two glycans 
at N418 and N495, which form two protrusions in 
the core-peripheral region (Fig. 2). 
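
For reference, the crystallographic R factors quoted above are conventionally defined as follows. This is the standard textbook definition, given here as context and not taken from this study's methods:

```latex
R = \frac{\sum_{hkl} \bigl|\, |F_{\mathrm{obs}}(hkl)| - |F_{\mathrm{calc}}(hkl)| \,\bigr|}
         {\sum_{hkl} |F_{\mathrm{obs}}(hkl)|}
```

Rwork is computed over the reflections used in refinement, whereas Rfree is computed over a small test set of reflections excluded from refinement.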

The external subdomain of HKU5-CTD extends out of βc4 in the 
core-center, sequentially folds into two antiparallel β strands (β1’ and 
β2’), an α helix (H1’), and another two antiparallel β strands (β3’ and 
β4’), and finally proceeds into βc5. Between β1’ and H1’, a 
disulfide bond (C511-C532) is formed to stabilize the external 
structure (Fig. 2). 


2.2. Conserved core subdomain and variable external subdomain for 
betaCoVs S protein 


To date, seven structures of CTDs covering all four lineages of 
betaCoVs have been solved. They are HKU1-CTD and MHV-CTD, 
belonging to lineage A (Kirchdoerfer et al., 2016; Walls et al., 2016), 
MERS-RBD/CTD (Lu et al., 2013; Wang et al., 2013), HKU4-RBD/CTD 
(Wang et al., 2014), and HKU5-CTD (reported here), grouped in 
lineage C, and SARS-RBD/CTD (F. Li et al., 2005) and HKU9-CTD 
(Huang et al., 2016), representing lineages B and D, respectively. 
All seven betaCoV CTD structures display a conserved core subdomain, 
with five antiparallel β strands and a conserved disulfide bond 
between βc2 and βc4 (Fig. 3). 

Despite the different combinations of α helices and β strands, the 
orientations of the secondary structures are conserved among CTDs in 
the core-peripheral region. In addition, two highly conserved disulfide 
bonds exist. One is formed between the N-terminus and the loop/β 
strand extending from βc1, and the other links the C-terminus and the 
loop/β strand proceeding to βc3. Thus, through the two disulfide 
bonds, both termini are brought into close proximity (Fig. 3). Although 
in SARS-RBD/CTD the electron density at the C-terminus is not clear 
enough to determine the structure (Fig. 3C), two conserved cysteines 
are present, indicating that a disulfide bond probably forms 
(Fig. 1C). 

In contrast to the conserved core subdomain, the external subdomain 
varies considerably among different lineages. In lineage A, the external 
subdomain of MHV-CTD, which was obtained by density-guided 
homology modelling due to its large flexibility and the poor quality of 
the density in this region, consists of four β strands and three small 
helices (PDB code: 3CJL) (Fig. 3A). HKU1-CTD is comprised of a large, 
variable loop with three inlaid β strands (Fig. 3B). The absence of clear 
secondary structure from residues C476-F572 indicates the flexibility 
of this region (PDB code: 5I08). In lineage C, three CTDs show similar 
external folds, with rmsd ranging from 0.962 (HKU5-CTD vs. 
MERS-RBD/CTD (PDB code: 4KQZ)) to 1.178 (HKU5-CTD vs. HKU4-RBD/CTD 
(PDB code: 4QZV)). All three external subdomains are β strand-dominated 
structures with four antiparallel β strands and expose a 
flat strand-face that is stabilized by a conserved disulfide bond 
(Fig. 3D-F). In lineage B, the SARS-RBD/CTD is dominated by a disulfide 
bond-stabilized flexible loop that connects two small β strands (PDB 
code: 2GHV) (Fig. 3C). In BatCoV HKU9, representing lineage D, the 
external subdomain only contains one large helix in this region (PDB 
code: 5GYQ) (Fig. 3G). 

Although their external subdomain structures differ, all CTDs in 
betaCoVs extend out from βc4 and proceed back to the core subdomain 
through βc5 (Fig. 3), indicating that during evolution, different 
insertions in this region resulted in the different structures of the 
CTDs. This, in turn, led to different receptor usage where the CTD is 
utilized as the RBD. 

Fig. 1. Sequence features of HKU5-CTD. (A) Schematic representation of BatCoV HKU5 S. The indicated domain elements were defined through pairwise sequence alignments or 
bioinformatics analyses. The signal peptide (SP), transmembrane domain (TM), and heptad repeats 1 and 2 (HR1 and HR2, respectively) were predicted with the SignalP 4.0 server, 
TMHMM server, and Learncoil-VMF program, respectively, while the NTD, RBD, and fusion peptides (FP, IFP, and PTM) were deduced by alignment with the N-terminal galectin-like 
domain of murine hepatitis virus S, MERS-RBD/CTD, and SARS-CoV S, respectively. The S1/S2 and S2’ sites potentially cleaved by furin-like proteases were predicted using the ProP 
1.0 server. (B) Phylogenetic tree generated using MEGA with the indicated RBD/CTD sequences. (C) Structure-based sequence alignment. The secondary structure elements are defined 
based on an ESPript algorithm and are labeled as in Fig. 2. The arrows and spiral lines indicate strands and helices, respectively. The conserved cysteine residues that form four disulfide 
bonds in the structures are marked with Arabic numerals 1-4. The β strands in the core-center, the elements in the core-peripheral region, and the structures in the external subdomains are 
marked with c, p, and a prime, respectively. The blue triangles and green stars represent the key amino acids for binding hCD26 in MERS-RBD/CTD and HKU4-RBD/CTD, 
respectively, while the yellow rhombuses indicate the two glycosylation sites in HKU5-CTD. 
Table 1 
Data collection and refinement statistics. 
2.3. Structural basis for HKU5-CTD not binding to CD26 


Both MERS-RBD/CTD and HKU4-RBD/CTD bind to hCD26 to 
initiate infection. In addition, the structure of HKU5-CTD displays 
a similar topology to the two RBD/CTDs in lineage C. Thus, we assayed 
for binding between HKU5-CTD and hCD26. However, consistent with 
previous results, no binding was detected, either by 
fluorescence-activated cell sorting (FACS) or by surface plasmon 
resonance (SPR) (Fig. 4). 

According to the two solved complex structures, four clusters 
of residues in MERS-RBD/CTD and HKU4-RBD/CTD are 
involved in binding to hCD26. These residues are located in four β 
strands and two loops (the β1’/β2’ loop and the β3’/β4’ loop) in the 
external subdomain, as well as in H4 and H5 (for MERS-RBD/CTD) or H5 
and H6 (for HKU4-RBD/CTD) in the core subdomain and the loop 
connecting the two helices (Figs. 1C, 3E and 3F). However, half of these 
regions (the β1’/β2’ loop and β3’) have deletions in HKU5-CTD (Fig. 1C). 
Due to these deletions, the orientations of two loops (marked 1 and 2, 
respectively, in Fig. 3D-F) in HKU5-CTD differ from those in 
MERS-RBD/CTD and HKU4-RBD/CTD, which leads to conformational shifts 
in HKU5-CTD at the hCD26-binding interface (Fig. 5A and E). The 
β1’/β2’ loop in both MERS-RBD/CTD and HKU4-RBD/CTD inserts 
into the groove formed by two helices on the side and β strands on the 
bottom (Fig. 5B and F). Sixty-five (of 328 in total) and 49 (of 214 in 
total) van der Waals contacts, including 5 (of 16 in total) and 4 (of 13 
in total) hydrogen bonds, are formed by this loop in the 
MERS-RBD/CTD/hCD26 and HKU4-RBD/CTD/hCD26 complexes, respectively. In 
contrast, this loop in HKU5-CTD is tilted away by ~6 and 9 Å 
(Fig. 5B and F) compared to MERS-RBD/CTD and HKU4-RBD/CTD, 
respectively, which results in the loss of 
binding to hCD26 in this region. 
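
Contact counts such as those above depend on the distance cutoff chosen. The sketch below shows one simple way such a count could be made, assuming a 4.5 Å cutoff; that value is a common choice assumed here for illustration, as the cutoff is not stated in this section, and the function name and toy coordinates are likewise hypothetical.

```python
import math

def count_contacts(atoms_a, atoms_b, cutoff=4.5):
    """Count atom pairs (one atom from each molecule) within `cutoff`.
    The 4.5 angstrom default is an assumed, commonly used value."""
    n = 0
    for a in atoms_a:
        for b in atoms_b:
            if math.dist(a, b) <= cutoff:
                n += 1
    return n

# Toy coordinates in angstroms, not taken from any deposited structure.
ligand = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
receptor = [(3.0, 0.0, 0.0), (0.0, 4.0, 0.0), (20.0, 0.0, 0.0)]
print(count_contacts(ligand, receptor))
```

Restricting `atoms_a` to the atoms of a single loop gives a per-region count of the kind compared with the interface total in the text.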

Moreover, a six-residue deletion in β3’ causes large discrepancies in 
the arrangement of β3’, β4’ and their connecting loop, compared with 
MERS-RBD/CTD and HKU4-RBD/CTD (Fig. 3D-F). In β3’ of both 
hCD26-binding RBD/CTDs, the side chains of Y540 and R542 face the 
receptor, conferring a strong hydrophilic interaction between the ligand 
and the receptor. In contrast, in HKU5-CTD, the orientation of β3’ is 
opposite. In addition, Y544 in HKU5-CTD likely clashes sterically with 
Q286, which further pushes HKU5-CTD away from hCD26 (Fig. 5C and G). In 
the other β strand, β4’, both MERS-RBD/CTD and HKU4-RBD/CTD 
form a large hydrophobic interaction patch with hCD26. In HKU5-CTD, 
aside from the shift of the β3’/β4’ loop away from the receptor, the 
deletion of hydrophobic residues (e.g., W535) compared to MERS-RBD/CTD 
(Fig. 5D) and the substitution of hydrophilic residues (e.g., T553 and 
T555) for hydrophobic ones (I560 and V562 in HKU4-RBD/CTD) 
(Fig. 5H) likely inhibit HKU5-CTD binding to hCD26. In total, the 
conformational variations between HKU5-CTD and hCD26-binding 
RBD/CTDs explain the lack of hCD26 binding by HKU5-CTD. 
However, various deletions in HKU5-CTD loop 1 are present in nature 
(Fig. 6) and might contribute to the evolution of receptor binding. 
Fig. 2. Crystal structure of the HKU5-CTD. The core and external subdomains are colored orange and magenta, respectively. The core subdomain is further divided into a center 
region (core-center) and a peripheral region (core-peripheral). The core-center strands and helices are labeled βc1-βc5 and H1-H5, respectively, while the core-peripheral strands are 
marked βp1 and βp2. The glycan moieties are displayed as sticks and marked in the left panel. The disulfide bonds are presented as spheres and labeled in the right panel. Both the N- and 
C-termini are indicated with arrows. To depict the structure clearly, cartoon structures inside the transparent surface are presented at two angles. 

Fig. 3. Topological diagrams of CTDs in betaCoVs. Structural and topological comparison of available betaCoV CTD structures. Seven structures, including those of MHV-CTD 
(PDB code: 3CJL), HKU1-CTD (PDB code: 5I08), SARS-RBD/CTD (PDB code: 2GHV), HKU5-CTD, HKU4-RBD/CTD (PDB code: 4QZV), MERS-RBD/CTD (PDB code: 4KQZ), and 
HKU9-CTD (PDB code: 5GYQ) were oriented similarly and are presented as cartoons in parallel. The conserved disulfide bonds are shown as red lines, while the non-conserved ones are 
displayed as lines in the color of the indicated external subdomain. Arrows and cylinders represent the strands and helices, respectively. 


3. Discussion 


In this study, we solved the crystal structure of HKU5-CTD, which 
represents the seventh structure of a CTD belonging to a betaCoV. Like 
the other six CTDs (Huang et al., 2016; Kirchdoerfer et al., 2016; F. Li 
et al., 2005; Lu et al., 2013; Peng et al., 2011; Walls et al., 2016; Wang 
et al., 2013, 2014), there are two subdomains in HKU5-CTD, the core 
and the external. Despite the low residue conservation among CTDs 
(pair-to-pair amino acid identity ranging from 17.2% to 58.7%) and the 
core subdomains (pair-to-pair amino acid identity ranging from 16.6% 
to 66.7%) in the four lineages, the topology of the core subdomains is 
highly conserved, with five antiparallel β strands constituting the 
core-center and the same orientation of secondary elements in the 
core-peripheral region (Fig. 3). This includes the same region of MHV, 
which uses the NTD of S1 to bind the receptor. 
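
Pair-to-pair identities like those quoted above are computed from aligned sequences. A minimal sketch follows, assuming identity is counted over alignment columns in which neither sequence has a gap; this is one of several conventions, and the one used for the values in the text is not stated. The function name and the toy alignment are illustrative.

```python
def percent_identity(aln_a, aln_b):
    """Percent identity over aligned columns where neither sequence has a gap.
    Inputs are two rows of a pairwise alignment with '-' marking gaps."""
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(x, y) for x, y in zip(aln_a, aln_b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    matches = sum(1 for x, y in pairs if x == y)
    return 100.0 * matches / len(pairs)

# Toy alignment: 6 gap-free columns, 5 of them identical.
print(percent_identity("ACD-EFGH", "ACDKEYG-"))
```

Using the full alignment length or the shorter sequence length as the denominator would give somewhat different percentages for the same alignment.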

However, the external subdomains vary considerably among 
lineages. In lineage A, the MHV-CTD contains several β strands and 
inlaid helices, while the HKU1-CTD is comprised of loops and three 
small β strands. In HKU1-CTD, however, approximately 100 amino acids 
(C476-F572) are unclear in this region, likely due to their flexibility. 
SARS-RBD/CTD, in lineage B, is dominated by loops, which are stabilized 
by a disulfide bond and two antiparallel β strands. Most CTD structures 
solved to date are in lineage C, and all three CTDs (MERS-RBD/CTD, 
HKU4-RBD/CTD, and HKU5-CTD) display conserved structures with 
β strand-forming platforms decorated with helices. In addition, a 
disulfide bond is conserved among CTDs in lineage C in the external 
subdomain. HKU9-CTD, a member of lineage D, is comprised of a helix 
that is clamped with loops. Although different structures and 
topologies exist among lineages, all external subdomains extend out from 
βc4 and proceed back to βc5 (Fig. 3), indicating that different 
insertions between βc4 and βc5 during betaCoV evolution have conferred 
different properties on the betaCoVs, such as receptor usage, and 
thereby led to the parallel evolution of lineages. 

The ligand-receptor interaction is a key factor determining the 
tissue tropism and host range of CoVs. For SARS-CoV, MERS-CoV, and 
BatCoV HKU4, the receptors are clear, and the complex structures 
demonstrate that the receptor mainly binds to the varied external 
subdomains of the CTDs. Neutralizing antibodies against HCoV HKU1 
bind to the HKU1-CTD, not the HKU1-NTD (Qian et al., 2015), 
indicating that the CTD in HCoV HKU1 is most likely to be the RBD, 
though the protein receptor has yet to be identified (Chan et al., 2016; 
Huang et al., 2015). HKU9-CTD does not bind to ACE2 or hCD26 
(Huang et al., 2016). In our study, we found that although HKU5-CTD 
displays a similar structure and topology to MERS-RBD/CTD and 
HKU4-RBD/CTD, detailed structural analysis revealed variations at the 
hCD26-binding interface, which result in the loss of binding to this 
receptor. Thus, subtle distinctions in external subdomains could 
determine different receptor usage. 

Fig. 4. Characterization of HKU5-CTD by FACS and SPR. (A) Huh7 cells were stained with MERS-RBD/CTD (green), HKU4-RBD/CTD (orange), and HKU5-CTD (red), 
respectively. (B-D) BHK21 cells transfected with hCD26 (BHK-hCD26) were stained with MERS-RBD/CTD (B), HKU4-RBD/CTD (C), and HKU5-CTD (D), respectively. (E-F) The 
indicated protein was immobilized on a CM5 chip, and a gradient concentration of hCD26 was flowed through the chip. The RUs were recorded. (E) hCD26 binding to HKU5-CTD. (F) 
hCD26 binding to MERS-RBD/CTD. (G) hCD26 binding to HKU4-RBD/CTD. (H) The saturation profile for HKU4-RBD/CTD binding to hCD26. 

Fig. 5. Structural basis for the lack of binding between HKU5-CTD and hCD26. Superimposition of the structures of HKU5-CTD and hCD26-bound MERS-RBD/CTD (A-D) 
or HKU4-RBD/CTD (E-H). The variations in the receptor-binding interface of HKU5-CTD compared with MERS-RBD/CTD or HKU4-RBD/CTD are marked B-D and F-H and 
further delineated in panels B-D and F-H, respectively, to show the detailed structural shifts. The conserved core subdomains in HKU5-CTD, MERS-RBD/CTD, and HKU4-RBD/CTD are colored 
grey, while the external subdomains of the three proteins are colored orange, cyan, and wheat, respectively. hCD26 is shown in magenta. 

Fig. 6. CTDs of HKU5 show diversity. All referred sequences in HKU5 CTD regions were analyzed by alignment. The two black boxes indicate the sequences of loops 1 and 2 
marked in Fig. 3C. 

In addition to receptor binding, the priming process, in which 
host proteases liberate S2 and the fusion peptides from the otherwise 
covalently linked S1 subunit, is another key factor affecting cell 
tropism and the entry route of CoVs. A two-step activation mechanism 
has been proposed for MERS-CoV entry (Millet and Whittaker, 2014). 
During the secretion of S protein, proteolysis at S1/S2 occurs in the 
endoplasmic reticulum (ER)-Golgi compartments where furin is 
localized, while during virus entry into target cells, S2’ is cleaved. 
The same proteolysis at S1/S2 and S2’ is also essential for SARS-CoV 
infection, except that due to the lack of a furin-recognition site at 
S1/S2, SARS-CoV S remains uncleaved after biosynthesis and a diverse 
array of proteases is involved in this process (Millet and Whittaker, 
2015; Simmons et al., 2011). In contrast, although BatCoV HKU4 can 
utilize hCD26 as a receptor, proteolysis is blocked due to the lack of a 
protease site. Treating pseudovirus particles containing the BatCoV 
HKU4 S protein with trypsin or introducing the furin-recognition site 
into the S protein enables the particles to infect hCD26-expressing 
cells (Wang et al., 2014; Yang et al., 2015), indicating that BatCoV 
HKU4 is less adapted to human cells. However, in the BatCoV HKU5 S 
protein, both furin-recognition sites are present, as in MERS-CoV. 
Accordingly, BatCoV HKU5 S is predicted to be cleaved at S1/S2 after 
biosynthesis and at S2’ during virus entry. Furin is a ubiquitous 
proteinase that is expressed in nearly all cell lines. The presence of 
the two furin-recognition sites indicates that the priming of BatCoV 
HKU5 S can readily occur. 

BatCoV HKU5 has been circulating in bats (Lau et al., 2013). In an 
epidemiology study over a 7-year period (April 2005 to August 2012), 
25% of alimentary specimens of Japanese pipistrelle bats (Pipistrellus 
abramus) collected from 13 locations in Hong Kong were positive for 
this virus (Lau et al., 2013), indicating that it might target gastro- 
intestinal tissues. However, BatCoV HKU5 virus has not been isolated 
and cultured successfully, which is an obstacle to virus transmission 
research. The problem is largely due to a lack of suitable cell lines for 
BatCoV HKU5 virus. The emerging but puzzling question is whether 
this virus could infect humans or not. 

An infectious clone of BatCoV HKU5 containing the ectodomain 
from the SARS-CoV S protein was constructed through reverse genetics 
and synthetic-genome design, and the recombinant virus replicates 
efficiently in cell culture and in young and aged mice (Agnihothram 
et al., 2014). In addition, the key proteins for virus replication, such as 
the 3C-like protease, polymerase, and exonuclease of BatCoV HKU5 
display high amino acid sequence similarity to those in MERS-CoV, 
indicating that once the genome of BatCoV HKU5 is released into host 
cells, genome replication, virus particle assembly, and release can 
readily occur. Therefore, the receptor would be the last barrier for 
BatCoV HKU5 to infect humans. Our data show that BatCoV HKU5-CTD 
does not use hCD26 as a receptor, though it folds into a structure very 
similar to those of MERS-RBD/CTD and HKU4-RBD/CTD. In other 
words, the cellular receptor of BatCoV HKU5 is still a mystery that 
requires further study. 

Evolutionarily, the BatCoV HKU5 S protein is more diverse than the 
BatCoV HKU4 S protein (Lau et al., 2013), and various deletions in 
loop 1 (Fig. 3D and Fig. 6) have been sequenced. This indicates that 
BatCoV HKU5 is able to generate variants to occupy new ecological 
niches and might acquire the ability to bind to hCD26 by accumulating 
mutations, ultimately causing human respiratory infections like 
MERS-CoV and SARS-CoV. Accordingly, it is very important to perform 
long-term surveillance of BatCoV HKU5 evolution, especially of 
variation in the S protein, in case the virus breaks the inter-species 
and/or inter-tissue transmission barriers. 


4. Materials and methods 
4.1. Gene construction, protein expression, and purification 


The coding region for HKU5-CTD (Q389-Q586) with a 6x His-tag at 
its C-terminus was cloned into the EcoRI and XhoI sites of pFastBac. 
HKU5-CTD, MERS-RBD/CTD, HKU4-RBD/CTD, and hCD26 were 
expressed and purified as previously reported (Lu et al., 2013; Wang 
et al., 2014). Briefly, the proteins were expressed in baculovirus- 
infected Hi5 cells. After 48 h of culturing, the supernatants were 
collected and purified through a 5 mL HisTrap HP column (GE 
Healthcare). The bound protein was eluted by a gradient of imidazole. 
Fractions containing the target protein as determined by SDS-PAGE 
were pooled and further subjected to gel filtration using a Superdex 75 
column (GE Healthcare) in a buffer composed of 20 mM Tris-HCl (pH 
8.0) and 50 mM NaCl. 

The Fc fusion protein used for cell staining was purified following a 
previously published method (Lu et al., 2013; Wang et al., 2014). In 
brief, the plasmid containing the target gene was transfected into 
HEK293T cells. After 3 and 7 d of culturing, the supernatants 
containing secreted protein were pooled and subjected to HiTrap 
ProteinA chromatography (5 mL, GE Healthcare). Protein was eluted 
with 0.1 M sodium citrate (pH 4.5) and further purified by gel 
filtration. The protein was finally buffer-exchanged into PBS, 
concentrated to ~1 mg/mL, and stored at -80 °C before further use. 


Virology 507 (2017) 101-109 


4.2. Crystallization, data collection, and structure determination 


The HKU5-CTD protein expressed in Hi5 cells was crystallized by 
sitting drop vapor diffusion at 18 °C. Diffraction-quality crystals were 
obtained in a condition consisting of 0.2 M potassium thiocyanate and 
20% (w/v) polyethylene glycol 3350 with a protein concentration of 
15 mg/mL. 
Crystals were cryoprotected in reservoir solution containing 20% (v/v) 
glycerol and flash-frozen in liquid nitrogen. Diffraction data were collected 
at Shanghai Synchrotron Radiation Facility (SSRF) BL17U and processed 
with HKL2000 (Otwinowski and Minor, 1997). 

The HKU5-CTD structure was solved by the molecular replacement 
method using Phaser (McCoy et al., 2007) from the CCP4 program suite 
(Winn et al., 2011) with the structure of HKU4-RBD/CTD (PDB: 5GJ4) as 
the search model. Initial restrained rigid-body refinement and manual 
model building were performed using REFMAC5 (Murshudov et al., 
2011) and COOT (Debreczeni and Emsley, 2012), respectively. Further 
refinement was performed using Phenix (Adams et al., 2010). Final 
statistics for data collection and structure refinement are presented in 
Table 1. Atomic coordinates and structure factors have been deposited in 
the Protein Data Bank with accession code 5XGR. 


4.3. SPR analysis 


The BIAcore experiments were performed at 25 °C using a BIAcore 
3000 machine with CM5 chips (GE Healthcare). All proteins used in the 
experiment were expressed in insect cells. After purification, the protein 
was exchanged into PBS (pH 7.4) containing 0.005% (v/v) Tween 20. The 
MERS-RBD/CTD, HKU4-RBD/CTD, and HKU5-CTD proteins were each 
immobilized on the chip at ~1000 response units (RUs). Gradient 
concentrations of hCD26 (ranging from 0.195 to 200 μM) were then 
injected at 30 μL/min. After each cycle, the sensor surface was regenerated 
using 7 μL of 10 mM NaOH. Measurements from the reference flow cell 
(immobilized with BSA) were subtracted from experimental values. The 
binding kinetics were analyzed with BIAevaluation Version 4.1 software 
using the 1:1 Langmuir binding and/or the steady-state affinity models. 
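
For orientation, the steady-state affinity model for 1:1 binding relates the equilibrium response to analyte concentration as Req = Rmax × C / (KD + C). The sketch below fits this equation to synthetic data by a simple grid search; it is not the algorithm used by the analysis software named above. The two-fold dilution series, the KD and Rmax values, and the function name are assumptions chosen for illustration only.

```python
def fit_steady_state(concs, responses):
    """Fit Req = Rmax * C / (KD + C) by a log-spaced grid search over KD.
    For each trial KD the least-squares Rmax has a closed form.
    Returns (KD, Rmax) in the units of the inputs."""
    best = None
    for i in range(6001):
        kd = 10 ** (-3 + 6 * i / 6000)        # trial KD from 1e-3 to 1e3
        f = [c / (kd + c) for c in concs]      # fractional occupancy at each C
        rmax = sum(r * x for r, x in zip(responses, f)) / sum(x * x for x in f)
        sse = sum((r - rmax * x) ** 2 for r, x in zip(responses, f))
        if best is None or sse < best[0]:
            best = (sse, kd, rmax)
    return best[1], best[2]

# Synthetic, noise-free data: an assumed two-fold dilution series starting at
# 0.195 uM, generated with hypothetical KD = 10 uM and Rmax = 100 RU.
concs = [0.195 * 2 ** k for k in range(11)]
responses = [100 * c / (10 + c) for c in concs]
kd, rmax = fit_steady_state(concs, responses)
print(round(kd, 2), round(rmax, 1))
```

With real sensorgrams, the responses would be the reference-subtracted plateau values at each concentration, and a nonlinear least-squares routine would normally replace the grid search.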


4.4. Flow cytometry assay 


Human hepatocellular carcinoma Huh7 or BHK-21 cells trans- 
fected with hCD26 (BHK-hCD26) were used for the binding test. Cells 
were stained with mouse Fc-fusion proteins, including MERS-RBD/ 
CTD, HKU4-RBD/CTD, and HKU5-CTD, expressed using HEK 293T 
cells. BHK-21 cells were transfected with hCD26-containing plasmids 
24 h before staining. Huh7 and BHK-hCD26 cells were suspended in PBS 
and incubated with individual proteins at 4 °C for 30 min, then washed 
and further incubated at 4 °C for another 30 min with an anti-mIgG/ 
PE antibody. After washing, the cells were analyzed using a BD 
FACSCalibur. The data were processed with FlowJo software. 